Using automated planning for improving data mining processes

Authors

  • Susana Fernández
  • Tomás de la Rosa
  • Fernando Fernández
  • Rubén Suárez
  • Javier Ortiz Laguna
  • Daniel Borrajo
  • David Manzano-Macho
Abstract

This paper presents a distributed architecture for automating data mining processes using standard languages. Data mining is a difficult task that relies on exploratory, analytic processing of large quantities of data to discover meaningful patterns. The increasing heterogeneity and complexity of available data require expert knowledge of how to combine the many alternative data mining tasks that process the data. Here, we describe data mining tasks in terms of Automated Planning, which allows us to automate the construction of the data mining knowledge flow. The work builds on standards defined in both the data mining and automated planning communities. We use PMML (Predictive Model Markup Language) to describe data mining tasks. From the PMML, a problem description in PDDL (Planning Domain Definition Language) can be generated, so any current planning system can be used to generate a plan. This plan is then translated back into a data mining workflow description in KFML format (the Knowledge Flow file format of the WEKA tool), so that the workflow can be executed in WEKA (Waikato Environment for Knowledge Analysis).
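The PMML-to-PDDL step described in the abstract can be illustrated with a minimal sketch. It is not the authors' translator: the predicate names (`is-continuous`, `is-categorical`), the goal `model-evaluated`, the type `field`, the domain name `data-mining`, and the function `pmml_to_pddl` are all hypothetical and chosen only for illustration. The toy input contains just a PMML `DataDictionary`; a real PMML file carries much more.

```python
import xml.etree.ElementTree as ET

# Toy PMML fragment: only a DataDictionary with three fields (illustrative data).
PMML_TEXT = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="churn" optype="categorical" dataType="string"/>
  </DataDictionary>
</PMML>"""

# Namespace prefix mapping used for the ElementTree path queries.
NS = {"p": "http://www.dmg.org/PMML-4_1"}


def pmml_to_pddl(pmml_text, problem="dm-task", domain="data-mining"):
    """Build a PDDL problem string from the data fields of a PMML document.

    Predicate, type, and goal names are hypothetical placeholders.
    """
    root = ET.fromstring(pmml_text)
    fields = root.findall("p:DataDictionary/p:DataField", NS)
    names = [f.get("name") for f in fields]
    # One initial-state fact per field, derived from its optype attribute.
    init = [f"(is-{f.get('optype')} {f.get('name')})" for f in fields]
    lines = [
        f"(define (problem {problem})",
        f"  (:domain {domain})",
        f"  (:objects {' '.join(names)} - field)",
        "  (:init " + " ".join(init) + ")",
        "  (:goal (model-evaluated)))",
    ]
    return "\n".join(lines)


pddl = pmml_to_pddl(PMML_TEXT)
print(pddl)
```

A planner given this problem plus a matching domain file would return a sequence of actions, which a second translator could then turn into a WEKA knowledge flow. That second step is omitted here because it depends on the KFML format details.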


Similar resources

Concept drift detection in business process logs using deep learning

Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g. the effects of new legislation, seasonal effects, etc.), traditional process mining techniques ...


On Compiling Data Mining Tasks to PDDL

Data mining is a difficult task that relies on exploratory and analytic processing of large quantities of data in order to discover meaningful patterns and rules. It requires complex methodologies, and the increasing heterogeneity and complexity of available data require some skill to build the data mining processes, or knowledge flows. The goal of this work is to describe data-mining process...


Assisting Data Mining through Automated Planning

The induction of knowledge from a data set relies on the execution of multiple data mining actions: applying filters to clean and select the data, training different algorithms (clustering, classification, regression, association), evaluating the results using different approaches (cross validation, statistical analysis), visualizing the results, etc. In a real data mining process, previous a...


A Data Mining approach for forecasting failure root causes: A case study in an Automated Teller Machine (ATM) manufacturing company

Based on the findings of Massachusetts Institute of Technology, organizations’ data double every five years. However, the rate of using data is 0.3. Nowadays, data mining tools have greatly facilitated the process of knowledge extraction from a welter of data. This paper presents a hybrid model using data gathered from an ATM manufacturing company. The steps of the research are based on CRISP-D...


Automated detection of coronavirus disease (COVID-19) by using data-mining techniques: a brief report

Background: The clinical field holds vast amounts of patient data that have not been analyzed. Discovering a way to analyze this raw data and turn it into a treasure of information can save many lives. Data mining methods are an efficient way to analyze this large amount of raw data. They can predict the future from accurate knowledge of the past, providing new insights into disease diagnosis and prevention. S...



Journal:
  • Knowledge Eng. Review

Volume 28  Issue 

Pages  -

Publication date 2013